DETAILED ACTION 

1. Claims 1-28 and 32 are rejected. 

Election/Restrictions 

2. Applicant's election of Group I (claims 1-28 and 32) in the reply filed on 20 
February 2007 is acknowledged. Because applicant did not distinctly and specifically 
point out the supposed errors in the restriction requirement, the election has been 
treated as an election without traverse (MPEP § 818.03(a)). 

Information Disclosure Statement 

3. The information disclosure statement (IDS) submitted on 19 August 2005 was 
filed after the mailing date of the application on 31 March 2003. The submission is in 
compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure 
statement is being considered by the examiner. 

Claim Rejections - 35 USC § 102 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

4. Claims 1-28 and 32 are rejected under 35 U.S.C. 102(e) as being anticipated 
by US Pat No 6,865,567 to Oommen et al (hereafter Oommen). 

Referring to claim 1, Oommen discloses in a database system, a sampling 
method for constructing a data structure based on the contents of a database 
comprising: 

a) gathering an initial sample [first phase] of data from the database and creating 
a first data structure from said initial sample [x number of tuples] (see column 21, lines 
16-19); 

b) gathering a second sample [second phase] of data from the database (see 
column 21, lines 19-28); 

c) determining an initial sufficiency of the data gathered from the database that is 
based on a comparison of the first data structure and the second sample of data (see 
column 21, lines 19-40); and 

d) forming a resultant data structure by gathering an additional sample of data 
from the database and using the additional amount of data to form the resultant data 
structure wherein the amount of data gathered in the additional sample is based on the 
initial sufficiency determination (see column 22, line 48 - column 23, line 17). 
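
For orientation only, the four steps recited in claim 1 can be sketched in code. This is a minimal illustration, not the applicant's implementation and not the method disclosed in Oommen. The function names, the equi-depth histogram, the maximum bucket deviation used as the sufficiency measure, and the rule that error shrinks with the square root of the sample size are all assumptions introduced here. The sketch also samples rows rather than blocks.

```python
import math
import random
from bisect import bisect_left


def build_equi_depth_histogram(values, num_buckets):
    """Return the upper boundary of each bucket of an equi-depth histogram."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[max(0, (i + 1) * n // num_buckets - 1)]
            for i in range(num_buckets)]


def bucket_error(boundaries, sample):
    """Hypothetical sufficiency measure: the largest deviation of a bucket's
    observed count from the equal share, as a fraction of the sample size."""
    k = len(boundaries)
    counts = [0] * k
    for v in sample:
        # Values above the last boundary are counted in the last bucket.
        counts[min(bisect_left(boundaries, v), k - 1)] += 1
    expected = len(sample) / k
    return max(abs(c - expected) for c in counts) / len(sample)


def additional_sample_size(current_size, error, target_error):
    """Assumed rule: error falls as 1/sqrt(sample size), so scale the
    current size by (error / target_error) squared."""
    if error <= target_error:
        return 0
    return math.ceil(current_size * (error / target_error) ** 2) - current_size


def adaptive_histogram(data, phase_size, num_buckets, target_error, rng):
    # a) initial sample and first data structure
    first = rng.sample(data, phase_size)
    first_hist = build_equi_depth_histogram(first, num_buckets)
    # b) second sample
    second = rng.sample(data, phase_size)
    # c) initial sufficiency: compare the first histogram with the second sample
    error = bucket_error(first_hist, second)
    # d) additional sample sized from the sufficiency determination
    extra = min(additional_sample_size(2 * phase_size, error, target_error),
                len(data))
    additional = rng.sample(data, extra)
    resultant = build_equi_depth_histogram(first + second + additional,
                                           num_buckets)
    return resultant, error, extra
```

A call such as `adaptive_histogram(list(range(1000)), 100, 4, 0.05, random.Random(0))` returns the resultant bucket boundaries, the measured error, and the number of additional rows drawn. When the measured error is already at or below the target, no additional rows are drawn.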

Referring to claim 2, Oommen discloses the method of claim 1 wherein the 
resultant data structure is formed based on data gathered in the initial sample, the 
second sample and the additional sample (see column 21, lines 19-40 and column 22, 
line 48 - column 23, line 17). 

Referring to claim 3, Oommen discloses the method of claim 1 wherein the first 
and resultant data structures are histograms (see column 22, lines 60-67). 

Referring to claim 4, Oommen discloses the method of claim 1 wherein the 
initial and second data samples are randomly retrieved block samples that form a first 
amount of data that is initially gathered and then divided in half to provide the initial and 
second data samples (see column 20, lines 60-67). 

Referring to claim 5, Oommen discloses the method of claim 4 wherein the 
initial and second data samples are sorted and used to form two histograms (see Fig 
11). 

Referring to claim 6, Oommen discloses the method of claim 5 wherein an error 
metric of the two histograms is formed by cross correlating the contents of the two 
histograms to determine the initial sufficiency (see Fig 19). 

Referring to claim 7, Oommen discloses the method of claim 6 wherein the 
initial and second data samples are further sub-divided to form sub-samples used to 
form other histograms of differing sample sizes that are cross correlated to find an error 
metric relating to said differing sample sizes (see Fig 19). 

Referring to claim 8, Oommen discloses the method of claim 6 wherein the 
initial and second data samples are further sub-divided to form additional sub-samples 
of smaller size that are used to form other histograms that are cross correlated for use 
in finding an error metric relating to sample sizes for use in determining a size of the 
additional sample of data to gather from the database (see Fig 11). 
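
The sub-sampling recited in claims 7 and 8 can be illustrated with a short sketch, again offered only as an aid to reading and not as the method of the application or of Oommen. It assumes two samples held in random order, equal-width histograms over a known value range with `high > low`, the maximum per-bucket frequency difference as the error metric, and an error model of the form c / sqrt(size). All names are hypothetical.

```python
import math


def equi_width_frequencies(values, low, high, num_buckets):
    """Relative frequency per equal-width bucket over [low, high]."""
    counts = [0] * num_buckets
    width = (high - low) / num_buckets
    for v in values:
        counts[min(int((v - low) / width), num_buckets - 1)] += 1
    return [c / len(values) for c in counts]


def cross_error(sample_a, sample_b, low, high, num_buckets):
    """Largest per-bucket difference between the two samples' histograms."""
    hist_a = equi_width_frequencies(sample_a, low, high, num_buckets)
    hist_b = equi_width_frequencies(sample_b, low, high, num_buckets)
    return max(abs(a - b) for a, b in zip(hist_a, hist_b))


def error_by_sample_size(sample_a, sample_b, low, high, num_buckets, levels):
    """Cross-compare sub-samples (prefixes) at successively halved sizes."""
    results = []
    size = min(len(sample_a), len(sample_b))
    for _ in range(levels):
        if size < num_buckets:
            break
        err = cross_error(sample_a[:size], sample_b[:size],
                          low, high, num_buckets)
        results.append((size, err))
        size //= 2
    return results


def estimate_required_size(results, target_error):
    """Fit error = c / sqrt(size) (an assumed model) and solve for the size
    at which the modelled error reaches target_error."""
    c = sum(err * math.sqrt(size) for size, err in results) / len(results)
    return math.ceil((c / target_error) ** 2)
```

Subtracting the amount of data already gathered from the estimate gives one possible size for the additional sample. Other error metrics or fitting rules could be substituted without changing the overall shape of the sketch.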

Referring to claim 9, Oommen discloses the method of claim 4 additionally 
comprising estimating distinct values of an attribute of the initial and second samples by 
eliminating records from the blocks that are duplicated within a given block and 
estimating distinct values by categorizing attributes as rarely or frequently occurring 
within the database (see column 7, lines 40-49). 

Referring to claim 10, Oommen discloses a computer readable medium for 
performing computer instructions to implement the method of claim 1 (see column 117, 
lines 12-24). 

Referring to claim 11, Oommen discloses a database system for constructing 
histograms based on sampling the contents of the database comprising: 

a) a database management component that gathers block size data segments 
from the database which in aggregate form a first sample of data having a first size [first 
phase -x number of tuples] (see column 21, lines 16-19); 

b) a histogram construction component that forms a first histogram from the first 
sample of data (see Fig 11); and 

c) a correlation component that determines an initial sufficiency of the first 
sample of data gathered from the database based on a comparison of the first 
histogram and data from the first sample of data (see column 21, lines 19-40); 

d) wherein said database management component gathers an additional sample 
of data used by said histogram construction component in creating a resultant 
histogram and the size of the additional sample is based on the initial sufficiency 
determination (see column 22, line 48 - column 23, line 17). 

Referring to claim 12, Oommen discloses the system of claim 11 wherein the 
resultant histogram is formed by the histogram construction component based on data 
gathered in the first sample of data and the additional data (see column 22, lines 60-67). 

Referring to claim 13, Oommen discloses the system of claim 11 wherein the 
first sample of data and the additional sample of data are randomly retrieved block 
samples (see column 20, lines 60-67). 

Referring to claim 14, Oommen discloses the system of claim 11 wherein 
histogram construction component sorts the data in said first sample of data as it 
constructs the first histogram (see column 22, lines 60-67 and Fig 11). 

Referring to claim 15, Oommen discloses the system of claim 11 wherein the 
correlation component determines an error metric by cross correlating the contents of 
the first histogram with other data in said first sample of data to determine the initial 
sufficiency (see Fig 11 and Fig 19). 

Referring to claim 16, Oommen discloses the system of claim 15 wherein the 
first sample of data is sub-divided to form sub-samples used to form histograms of 
differing sizes that are cross correlated to find an error metric relating to said differing 
sample sizes (see Fig 19). 

Referring to claim 17, Oommen discloses the system of claim 15 wherein the 
first sample of data is sub-divided to form additional sub-samples of smaller size that 
are used to form other histograms that are cross correlated for use in finding an error 
metric relating to sample sizes for use in determining a size of the additional sample of 
data to gather from the database (see Fig 11). 

Referring to claim 18, Oommen discloses in a database system, a sampling 
method for constructing a histogram based on the contents of a database comprising: 

a) gathering an initial sample [first phase with x number of tuples] (see column 
21, lines 16-19 and column 22, lines 60-67) of data from the database and creating a 
histogram from said initial sample; 

b) gathering a second sample of data from the database for comparison with said 
first histogram [second phase] (see column 21, lines 19-40); 

c) determining an initial sufficiency of the data gathered from the database that is 
based on a comparison of the second sample with the first histogram (see column 21, 
lines 19-40); and 

d) if the determination of initial sufficiency indicates the data in said initial and 
second samples is adequate to represent the database, combining the initial and 
second samples to form a resultant histogram, but if the determination of initial 
sufficiency indicates the initial and second samples are inadequate to represent the 
database, gathering an additional data sample to combine with the initial and second 
samples to form the resultant histogram wherein a size of the additional data sample is 
based on the initial sufficiency determination (see column 22, line 48 - column 23, line 
17). 

Referring to claim 19, Oommen discloses the method of claim 18 wherein the 
data is gathered in blocks from random storage locations within the database (see 
column 20, lines 60-67). 

Referring to claim 20, Oommen discloses in a database system, a system for 
constructing a data structure based on the contents of a database comprising: 

a) means for gathering an initial sample [first phase] of data from the database 
and creating a first data structure [histogram] from said initial sample (see column 21, 
lines 16-19 and column 22, lines 60-67); 

b) means for determining an initial sufficiency of the data gathered from the 
database that is based on a comparison of the first data structure and other data in the 
initial sample not used to create the first data structure (see column 21, lines 19-40); 
and 

c) means for forming a resultant data structure by gathering an additional sample 
of data from the database and using the additional amount of data to form the resultant 
data structure wherein the amount of data gathered in the additional sample is based on 
the initial sufficiency determination (see column 22, line 48 - column 23, line 17). 

Referring to claim 21, Oommen discloses the system of claim 20 wherein the 
resultant data structure is formed based on data gathered in the initial sample and the 
additional sample (see column 21, lines 19-40 and column 23, line 17). 

Referring to claim 22, Oommen discloses the system of claim 21 wherein the 
first and resultant data structures are histograms (see column 22, lines 60-67). 

Referring to claim 23, Oommen discloses the system of claim 20 wherein the 
initial data sample is made up of randomly retrieved block samples that form a first 
amount of data that is divided in half to provide data to form the data structure and data 
to cross correlate against the first data structure (see column 20, lines 60-67). 

Referring to claim 24, Oommen discloses the system of claim 23 wherein the 
initial data sample is sorted and used to form two histograms (see Fig 11). 

Referring to claim 25, Oommen discloses the system of claim 24 wherein an 
error metric of the two histograms is formed by cross correlating the contents of the 
two histograms to determine the initial sufficiency (see Fig 19). 

Referring to claim 26, Oommen discloses the system of claim 25 wherein the 
initial data sample is further sub-divided to form sub-samples used to form other 
histograms of differing sample sizes that are cross correlated to find an error metric 
relating to said differing sample sizes (see Fig 19). 

Referring to claim 27, Oommen discloses the system of claim 26 wherein the 
initial and second data samples are further sub-divided to form additional sub-samples 
of smaller size that are used to form other histograms that are cross correlated for use 
in finding an error metric relating to sample sizes for use in determining a size of the 
additional sample of data to gather from the database (see Fig 11). 

Referring to claim 28, Oommen discloses the system of claim 24 additionally 
comprising means for estimating distinct values of an attribute of the initial and second 
samples by eliminating records from the blocks that are duplicated within a given block 
and estimating distinct values by categorizing attributes as rarely or frequently occurring 
within the database (see column 7, lines 40-49). 

Referring to claim 29, Oommen discloses a computer readable medium for 
performing computer instructions to implement the method of claim 20 (see column 117, 
lines 12-24). 

Conclusion 

5. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

• US Patent No 6,785,684 titled "Apparatus and Method for Determining Clustering 
Factor in a Database using Block Level Sampling" 